Correlation testing provides a quick method of discriminating amongst potential terms to include
in a nuclear mass formula or functional; however, a firm mathematical foundation of the method has
not previously been set forth. Here, the necessary justification for correlation testing is developed and
the motivation behind its use is given in more detail. We provide a quantitative demonstration of the
method's performance and shortcomings, and highlight potential issues a user may encounter.
In concluding, we suggest some possible future developments to address the limitations of the
method.
I. INTRODUCTION

Nuclear structure lies at the heart of many crucial
problems, ranging from nuclear technology to stellar nucleosynthesis,
and its theories provide a means to explore
those systems not yet attainable experimentally. A
fundamental aspect of nuclear structure, nuclear masses,
has had a rich history of study. Since the first work
by Bethe and Weizsäcker, nuclear theory has sought to
provide a consistent and global description of nuclear
masses in a concise form. To this end, much progress
has been made over more than seventy years.
Global approaches include ab initio approaches such as the
no-core shell model and Green function Monte Carlo
0, H[, mean- field approaches such as Skyrme-Hartree- 
Fock-Bogoliubov and density functional theory [8-11],
microscopic-macroscopic approaches such as the finite-range
droplet model (FRDM) [12], and formulaic approaches
such as the Duflo-Zuker mass formula [13] (for
a review see Ref. [14]). In particular, this work addresses
approaches such as the Duflo-Zuker mass formula
or the density functional theory of Refs. [8, 9], as these methods
rely on the development of a function or functional
to accurately describe the relationship between nuclear
masses and a set of appropriate degrees of freedom.

Indeed, nuclear density functional theory (DFT) has 
proven to be a useful tool for globally describing the 
ground-state properties of nuclei [3, E3, El E3 ■ This ap- 
proach is based on the pioneering works of Skyrme [la ] 
and Vautherin and Brink [r| [l7| and finds theoretical 
basis on the theorems by Hohenberg and Kohn [l8j]. 
It has been applied through self-consistent mean-field
computations with density-dependent energy functionals [19].
Recently, developments in global nuclear
energy-density functionals have shown great progress.
The Skyrme-Hartree-Fock-Bogoliubov mass functionals
by Goriely et al. [20] achieve a least-squares error of about
$\chi = 0.58$ MeV for nuclei with $N, Z > 8$. The UNEDF
functionals of Ref. [8] have least-squares deviations on
the order of $\chi = 0.97$ MeV and $\chi = 1.46$ MeV. A review
on this matter is available in Ref.

While DFT allows for a simple solution to the quantum
many-body problem, the construction of the functional
itself poses a challenge [10]. Various efforts aim
at constraining the functional from microscopic interactions [21]
or devising a systematic approach [22]. Though
many approaches are empirical in nature [23-26], guidance
is found from, for instance, Refs. [27-29], which derived
energy-density functionals for dilute Fermi gases
with short-ranged interactions, or Refs. [30-33], which
suggest simple scaling arguments may guide the form of
the functional.

Furthermore, one may view the functional as a solution 
to the least-squares problem, 
where terms of the functional are vectors in the functional
space. In Eq. (1), $f(x_i)$ is the theoretical ground-state energy
of the $i$th nucleus provided by the functional, where
the functional is fit to $N_{\rm pts}$ experimental ground-state
energies, $E^{\rm exp}$. Guidance may be drawn from a detailed
numerical analysis of the functional terms as a solution to
Eq. (1). Bertsch et al. [35] employed eigen decomposition
to study the importance of various linear combinations of
terms that enter the functional. This method identifies
the relative importance of possible combinations of terms
and truncates search directions that are flat in the functional
space. Terms of similar importance redundantly
represent a vector in the functional space. Along these
lines, the author's previous work, Ref. [9], employs a correlation
test that chooses new functional terms based on
their correlation to terms already present. New terms are
chosen to provide a relatively independent search direction
in the functional space along which the least-squares
error is minimized.

While the correlation test was introduced in Ref. [9]
as a computationally cheap method for the development
of an energy functional for nuclear masses, there has not
been a detailed description of its properties, motivation
and performance. It is the goal of this paper to revisit
the systematic approach for choosing functional terms by
providing the necessary formal justification and characterization.
In Sect. II we briefly describe the method of
Ref. [9] and provide its motivation along with a formal
mathematical foundation of correlations as an indication
of linear independence. In Sect. III some quantitative
results demonstrate the performance of the method,
and in Sect. IV its limitations are shown and discussed.
Finally, in Sect. V conclusions are drawn,
suggesting some ways in which the method may be improved.

II. MOTIVATION

The technique of projecting out linearly dependent vectors
is not new to the field of optimization. The principal
axis method is frequently employed in the optimization
of functions for which derivatives are prohibitively expensive
to calculate or unavailable [36, 37], and is based
on the concept of orthogonal search directions.

The principal axis method is rooted in Taylor's theorem,
whereby any function may be approximated as a
quadratic about some point $x_0$ [38],

$$ f(x) \approx f(x_0) + (x - x_0)^T J(x) + \frac{1}{2}(x - x_0)^T \nabla^2 f(x)\,(x - x_0), \tag{2} $$

where $J(x)$ is the Jacobian of $f(x)$, and in the principal axis
theorem, which states that any quadratic form can be put in
the form $Q(x)$ [39],

$$ Q(x) = x^T A x, \tag{3} $$

where $A$ is an orthogonally diagonalizable symmetric matrix.

In the principal axis method a set of search directions
$u = u_1, \ldots, u_n$ in the $n$-dimensional function space is
updated iteratively to provide a new set of search directions
$u' = u'_1, \ldots, u'_n$ such that after $n$ iterations all $u'_i$ are
mutually orthogonal with respect to a quadratic form of
the function [36, 37].

The orthogonal search direction contains no linear dependence
and is obtained from an eigen decomposition of
$A$ [39]. This search direction corresponds to the orthonormal
eigenbasis of $A$ [39] and minimizes the quadratic form
exactly. For a proof see Ref. [37]. Reference [36] employs
a singular value decomposition on the search direction
$u$ to ensure that linear dependence is projected out, citing
greater efficiency over eigen decompositions. The method
has been used extensively in many fields including psychology,
image processing, and machine learning [40-42].
It is available in packaged optimization routines such as
PRincipal AXIS (PRAXIS) [36]. In nuclear physics, the
authors of Ref. [35] have utilized the concept of principal
axes when treating the terms of a functional as vectors
in the functional's space, again identifying independence
through an eigen decomposition. These well-known approaches,
however, can be prohibitively expensive when
one must consider searching through hundreds, if not
thousands, of possible terms to include in a functional.
They nevertheless served as the initial motivation underlying
the method of correlation testing.

The use of correlations is also not a new concept in optimization,
where it is used frequently in the statistical
analysis of function fits and is related to the sensitivity
and error of a given parameter set [1, 43, 44]. The variance
of a quantity (such as a fit parameter $c$ in a function)
is given by

$$ \mathrm{var}(c) = \langle c^2 \rangle - \langle c \rangle^2, \tag{4} $$

where $\langle\cdot\rangle$ indicates the average, calculated over the
appropriate distribution for $c$ [43], and is a measure of
the dispersion of $c$ for its distribution [43]. The standard
deviation $s_c$ is defined from the variance, $s_c = \sqrt{\mathrm{var}(c)}$,
and may be interpreted as the uncertainty in $c$. Furthermore,
the variance may be used to provide a confidence
interval for $c$. The $1 - \alpha$ confidence interval is the region
$\Omega$ of $c$ in which we expect the "true" value of $c$ to be,
with probability $P(c_{\rm true} \in \Omega) = 1 - \alpha$ [1, 13]. Where
$c$ is found by fitting to $N_{\rm pts}$ data points, the confidence
interval half-width $w$ is defined through

$$ |w - c| < \sqrt{\mathrm{var}(c)}\; t_{N_{\rm pts}-1,\,1-\alpha/2}, \tag{5} $$

where $t_{N_{\rm pts}-1,\,1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the
$t$-distribution with $N_{\rm pts}$ degrees of freedom [1, 43]. That
is, we expect the "true" value $c_{\rm true}$ to lie within $c \pm w$
$100(1 - \alpha)\%$ of the time.

The covariance may also be defined between multiple
quantities in a multivariate model, say two fit parameters
$c_1, c_2$:

$$ \mathrm{cov}(c_1, c_2) = \langle c_1 c_2 \rangle - \langle c_1 \rangle \langle c_2 \rangle, \tag{6} $$

where $\mathrm{cov}(c_1, c_1) = \mathrm{var}(c_1)$. The covariance provides a
measure of dependency between $c_1, c_2$ and, when normalized
to the variance of both parameters, yields the correlation

$$ R_c = \frac{\mathrm{cov}(c_1, c_2)}{\sqrt{\mathrm{var}(c_1)\,\mathrm{var}(c_2)}} = \frac{\mathrm{cov}(c_1, c_2)}{s_{c_1} s_{c_2}}. \tag{7} $$

Thus, from Eq. (7) one may obtain information on the
sensitivity of some parameter $c_1$ to changes in $c_2$. For the
use of variance and covariance in the analysis of functionals
post-fitting see, for example, Sect. IV B of Ref. [1] and
Refs. [13, 43].
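The definitions in Eqs. (4), (6) and (7) can be sketched numerically as follows. This is a minimal illustration, assuming the averages are plain arithmetic means over a finite sample; the sample values and the helper names (`mean`, `variance`, `covariance`, `correlation`) are hypothetical and chosen only for this example.

```python
import math


def mean(values):
    """Arithmetic mean, standing in for the average <.> of Eq. (4)."""
    return sum(values) / len(values)


def variance(c):
    """var(c) = <c^2> - <c>^2, Eq. (4)."""
    return mean([v * v for v in c]) - mean(c) ** 2


def covariance(c1, c2):
    """cov(c1, c2) = <c1 c2> - <c1><c2>, Eq. (6)."""
    return mean([a * b for a, b in zip(c1, c2)]) - mean(c1) * mean(c2)


def correlation(c1, c2):
    """R = cov(c1, c2) / sqrt(var(c1) var(c2)), Eq. (7)."""
    return covariance(c1, c2) / math.sqrt(variance(c1) * variance(c2))


# Hypothetical samples: c2 follows c1 closely, c3 is c1 reversed.
c1 = [1.0, 2.0, 3.0, 4.0, 5.0]
c2 = [2.1, 3.9, 6.2, 7.8, 10.1]
c3 = [5.0, 4.0, 3.0, 2.0, 1.0]

r12 = correlation(c1, c2)  # close to +1
r13 = correlation(c1, c3)  # exactly anti-correlated
print(f"R(c1, c2) = {r12:.4f}, R(c1, c3) = {r13:.4f}")
```

Because the correlation is normalized by both standard deviations, rescaling either sample by a nonzero constant changes neither its magnitude nor, for a positive constant, its sign.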
A. Selection of terms based on correlation

We first briefly describe the method detailed in Ref. [9]
for selecting new terms to be included in the functional.
We will deal only with the aspect of the method utilizing
the correlation coefficient as a measuring rod for a new
term. While the case in Ref. [9] is more complex, the
following nevertheless provides a firm foundation for the
use of correlation coefficients in functional analysis.

Consider a functional $F_0(c; x)$, which has $M$ terms
$f_1, \ldots, f_M$ that depend on some set of variables $x$ and
parameters $\{c\}$,

$$ F_0(c; x) = \sum_{a=1}^{M} c_a f_a(x). \tag{8} $$

Consider the addition of a term $c_{\rm new} f_{\rm new}$, with new fit
parameter $c_{\rm new}$, to the functional. The method is based
on the expectation that the addition of the term $c_{\rm new} f_{\rm new}$
to $F_0$ will be useful in lowering the chi-square only when
it is independent of the $M$ terms already included in the
functional.

This independence is defined through the correlation
coefficient

$$ R = \frac{\mathrm{cov}(f_a, f_{\rm new})}{s_{f_a}\, s_{f_{\rm new}}}. \tag{9} $$

Here the covariance is

$$ \mathrm{cov}(f_a, f_{\rm new}) = \langle f_a f_{\rm new} \rangle - \langle f_a \rangle \langle f_{\rm new} \rangle, \tag{10} $$

and the average $\langle\cdot\rangle$ is computed with respect to the
$N_{\rm pts}$ experimental points available for Eq. (1). Similarly,
the standard deviations are

$$ s_{f_{\rm new}} = \sqrt{\langle f_{\rm new}^2 \rangle - \langle f_{\rm new} \rangle^2} \tag{11} $$

and

$$ s_{f_a} = \sqrt{\langle f_a^2 \rangle - \langle f_a \rangle^2}. \tag{12} $$

We note that the correlation coefficient is independent
of the coefficient $c_{\rm new}$ of the new term under consideration.
Should the correlation be sufficiently low for all
included terms, the new term $c_{\rm new} f_{\rm new}$ may be included
in the functional. This allows one to investigate many
different forms of functional terms without the time-consuming
and computationally expensive aspects of performing
a full minimization of the least-squares error for each
new term in question. It is an application of the already
known correlation analyses to a function, but prior
to fitting.

B. Correlation as an indication of independence

Both the concepts of correlations and linear independence
are familiar to the realm of functional optimization,
and where the functional is a solution to Eq. (1)
the concepts can be seen as meaningful in the development
of functionals [8, 35]. Yet to justify the use of the
method in Ref. [9] one must ask: is the correlation coefficient
of Eq. (9) a meaningful indication of the linear independence
between two functions? This question becomes
particularly important when one recalls the frequent
warning to new students of statistics that a zero
covariance does not imply independence [43]. While this
statement may seem to condemn the method outright, we
will address the subtlety that while a zero covariance does
not imply statistical independence, it does in fact indicate
linear independence of functions of multiple variables in
the linear algebraic sense. Here, we will demonstrate the
test on a simple model. In the appendix, we set out the
necessary formal foundation through a demonstration by
contradiction.
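The distinction between statistical and linear independence can be illustrated with a small numerical sketch. It assumes a hypothetical symmetric grid of points and the toy functions $x$, $x^2$ and $3x$, none of which come from the mass formula itself. The function $x^2$ is completely determined by $x$, so the two are statistically dependent, yet their covariance vanishes and they are linearly independent as vectors. Linear dependence is checked here with the Gram determinant, which is zero exactly when two vectors are parallel.

```python
def mean(values):
    """Arithmetic mean over the sample points."""
    return sum(values) / len(values)


def covariance(f, g):
    """cov(f, g) = <f g> - <f><g>."""
    return mean([a * b for a, b in zip(f, g)]) - mean(f) * mean(g)


def gram_determinant(f, g):
    """(f.f)(g.g) - (f.g)^2; zero if and only if f and g are linearly dependent."""
    ff = sum(a * a for a in f)
    gg = sum(b * b for b in g)
    fg = sum(a * b for a, b in zip(f, g))
    return ff * gg - fg * fg


# Hypothetical sample points, symmetric about zero.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
f = x                        # f(x) = x
g = [v * v for v in x]       # g(x) = x^2, fully determined by x
h = [3.0 * v for v in x]     # h(x) = 3x, a multiple of f

print("cov(f, g)  =", covariance(f, g))        # zero covariance
print("Gram(f, g) =", gram_determinant(f, g))  # nonzero: linearly independent
print("Gram(f, h) =", gram_determinant(f, h))  # zero: linearly dependent
```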

We look to the Bethe-Weizsäcker mass formula [1] with
no pairing, as beyond its historical importance it contains
four terms of transparent physical meaning:

$$ E(N, Z) = a_v A - a_s A^{2/3} - a_c \frac{Z(Z-1)}{A^{1/3}} - a_a \frac{(N-Z)^2}{A}. \tag{13} $$

The radius of a nucleus is proportional to $A^{1/3}$, therefore
the first term $a_v A$ represents a volume and accounts for
the saturation of the nuclear force. The remaining terms
then represent a surface term ($a_s$), a Coulomb contribution
($a_c$), an isospin term to account for asymmetry
between protons and neutrons ($a_a$) and a pairing term
($a_p$). The Bethe-Weizsäcker mass formula in this form
achieves a least-squares error of $\chi = 3.0$ MeV when fit
to a set of 2,049 nuclei from the 2003 atomic data evaluation [48]
whose uncertainty in the binding energy is
below 200 keV. By looking at correlations amongst the
terms in Eq. (13) we may reasonably conclude the bulk
volume and surface components of the nucleus are fully
represented in this function space, and so Eq. (13) serves
as a suitable reference by which to compare possible new
terms. Table I shows the correlation matrix for Eq. (13),
where one can see that all the terms are highly correlated
with each other. We define "high" correlation as where
the absolute value of the correlation coefficient $R$ between
the new term and the pre-existing terms is greater than
0.5, $|R| > 0.5$. Likewise, "low" correlation is where the
absolute value of the correlation coefficient between the
new term and any pre-existing term is never greater than
0.5, $|R| < 0.5$.
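The selection rule described above can be sketched as a short routine. This is an illustration under stated assumptions rather than the implementation of Ref. [9]: the function name `passes_correlation_test`, the toy terms, and the ten sample points are hypothetical, and a term is accepted only when $|R| < 0.5$ against every term already present.

```python
import math


def mean(values):
    """Arithmetic mean over the sample points."""
    return sum(values) / len(values)


def correlation(f, g):
    """Correlation coefficient between two terms evaluated on the same points."""
    cov = mean([a * b for a, b in zip(f, g)]) - mean(f) * mean(g)
    s_f = math.sqrt(mean([a * a for a in f]) - mean(f) ** 2)
    s_g = math.sqrt(mean([b * b for b in g]) - mean(g) ** 2)
    return cov / (s_f * s_g)


def passes_correlation_test(existing_terms, new_term, threshold=0.5):
    """Accept new_term only if |R| < threshold against every existing term."""
    rs = [correlation(term, new_term) for term in existing_terms]
    return all(abs(r) < threshold for r in rs), rs


# Hypothetical terms evaluated on ten sample points (not nuclear data).
x = [float(i) for i in range(1, 11)]
existing = [x, [v ** 2 for v in x]]                  # terms already present
candidate_high = [v ** 3 for v in x]                 # smooth, tracks x and x^2
candidate_low = [1.0 if i % 2 else -1.0 for i in range(1, 11)]  # alternating sign

accepted_high, rs_high = passes_correlation_test(existing, candidate_high)
accepted_low, rs_low = passes_correlation_test(existing, candidate_low)
print("smooth candidate accepted:     ", accepted_high, rs_high)
print("alternating candidate accepted:", accepted_low, rs_low)
```

Note that no fit coefficient appears anywhere in the test, which is why it can be run before any minimization is attempted.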

TABLE I: Correlation matrix for terms of the Bethe-Weizsäcker mass formula [1] with no pairing.

                    A        A^{2/3}   Z(Z-1)/A^{1/3}   (N-Z)^2/A
A                   1
A^{2/3}             0.995    1
Z(Z-1)/A^{1/3}      0.982    0.968     1
(N-Z)^2/A           0.800    0.777     0.71942          1

Where terms are highly correlated, the variation of the
function with respect to one fit coefficient may be absorbed
by the simultaneous variation of the other term coefficients.
Physical arguments motivate the inclusion of
these four terms beyond the indicated redundancy of the
correlation test (as will be discussed in Sect. IV B). From
this, one would seek to include terms which are not correlated
to the four in Eq. (13). The fifth term of the full
Bethe-Weizsäcker mass formula is of course the pairing
term:

$$ a_p \frac{\delta}{A^{1/2}}, \tag{14} $$

where $\delta$ is 1, 0, and -1 for even-even, odd-mass, and odd-odd
nuclei, respectively. As one can see in Table II, the
pairing term has a low correlation to all other terms in the
mass formula, and so is acceptable not only through its
well-known physical motivation, but by the correlation
test. Its addition to the Bethe-Weizsäcker mass formula
reduces the least-squares error by about 0.1 MeV to $\chi = 2.9$ MeV.
While a greater change in least-squares error
may have been expected from a term with such a low
correlation, the reduction is meaningful [14] and the reasons
behind it not being greater are discussed in Sect. IV B.
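A correlation matrix of this kind can be computed with a few lines of code. The sketch below assumes a synthetic rectangular grid of $(N, Z)$ values rather than the 2,049 evaluated nuclei of Ref. [48], so the numbers it prints will differ from those in the tables; only the qualitative pattern (bulk terms strongly correlated, $\delta$ weakly correlated with all of them) is expected to carry over. The grid limits and helper names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean over the sample of nuclei."""
    return sum(values) / len(values)


def correlation(f, g):
    """Correlation coefficient between two terms evaluated on the same nuclei."""
    cov = mean([a * b for a, b in zip(f, g)]) - mean(f) * mean(g)
    s_f = math.sqrt(mean([a * a for a in f]) - mean(f) ** 2)
    s_g = math.sqrt(mean([b * b for b in g]) - mean(g) ** 2)
    return cov / (s_f * s_g)


def pairing_delta(n, z):
    """+1 for even-even, -1 for odd-odd, 0 for odd-mass nuclei."""
    if n % 2 == 0 and z % 2 == 0:
        return 1.0
    if n % 2 == 1 and z % 2 == 1:
        return -1.0
    return 0.0


# Synthetic (N, Z) grid: an assumption for illustration, not experimental data.
nuclei = [(n, z) for z in range(8, 88) for n in range(z, z + 40)]

terms = {
    "A": [float(n + z) for n, z in nuclei],
    "A^(2/3)": [(n + z) ** (2.0 / 3.0) for n, z in nuclei],
    "Z(Z-1)/A^(1/3)": [z * (z - 1) / (n + z) ** (1.0 / 3.0) for n, z in nuclei],
    "(N-Z)^2/A": [(n - z) ** 2 / (n + z) for n, z in nuclei],
    "delta": [pairing_delta(n, z) for n, z in nuclei],
}

names = list(terms)
matrix = {(a, b): correlation(terms[a], terms[b]) for a in names for b in names}

# Print the lower triangle, mirroring the layout of the tables.
for i, a in enumerate(names):
    row = "  ".join(f"{matrix[(a, b)]:+.3f}" for b in names[: i + 1])
    print(f"{a:>15}  {row}")
```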



TABLE II: Correlation matrix for terms of the full Bethe-Weizsäcker mass formula [1] with pairing. The pairing term
passes the correlation test for inclusion.

                    A            A^{2/3}      Z(Z-1)/A^{1/3}   (N-Z)^2/A     δ
A                   1
A^{2/3}             0.995        1
Z(Z-1)/A^{1/3}      0.982        0.968        1
(N-Z)^2/A           0.800        0.777        0.71942          1
δ                   -7×10^{-3}   1×10^{-2}    4×10^{-2}        1×10^{…}      1

For our demonstration we take a simplified version of
the energy functional in Ref. [9], with fit coefficients
$c = \{c_c, c_s, c_{as}, c_{ss}, c_1, c_2, c_3\}$:

F(c;n,z) = c c Z ^~ 3 ^ + hio f Cl A-^ 3 ^p(z P + n p ) 

^ p 

_|_ C27 4-l/3 X~* ( Zp ( Zp ~ jj 4- n p( n P ~ _|_ 2n P Z ? 



c 3 A 



(15) 



where z p and n p are the occupations of proton and neu- 
tron shells, respectively, and 



Cs ) C-as 5 Css ) 



= i-c s A-^ 

('■as 



T(T+ 1) 



1 



4-1/3 A 2 



(16) 



Our truncated form of the energy functional is taken so
as to include one- and two-body terms along with the
Coulomb interaction and a volume term $c_3 A$. For details
on the form and origin of $\hbar\omega$ we refer the reader
to Ref. [9]. Furthermore, as we are interested in demonstrating
the use of correlation testing, we treat $F(c; n, z)$
as a mass formula and set the occupations $n, z$ at a naive
level filling. This removes the multilevel optimization
necessary to fit the full functional [9]. While for the remainder
of this section we will proceed with a formula,
the demonstration is extendable to functionals.

III. PERFORMANCE OF CORRELATION TEST

While we have provided motivation for the correlation
test, along with the necessary mathematical foundation
for the general case, it is often helpful to view simple examples
for quantitative insight. In this section we will
address the potential benefits of using correlation testing
in functional development. The greatest appeal of
the correlation test lies in its inexpensive computational
cost. As such, we benchmark the performance of the
method by comparing the cost of a full minimization of
a set of terms versus selecting terms from the correlation
test. As was done in the development of the full energy
functional in Ref. [9], we perform correlation tests on an
initial set of about 200 different terms and utilize the
same definition for high and low correlation.

Where there is no multilevel optimization, the addition
of a new term and fit coefficient $c_{\rm new} f_{\rm new}$ can only result
in a lowering of the least-squares error. That is,

$$ \sum_{i=1}^{N_{\rm pts}} \left( F(c; n_i, z_i) - E_i^{\rm exp} \right)^2 \geq \sum_{i=1}^{N_{\rm pts}} \left( F(c; n_i, z_i) + c_{\rm new} f_{\rm new} - E_i^{\rm exp} \right)^2, \tag{17} $$

as fitting can obtain the left-hand side of the inequality
at $c_{\rm new} = 0$. For brevity we make use of the average absolute
value of the correlation coefficient, $\langle |R| \rangle$, where we
have averaged over the absolute value of the correlation
coefficient between the new term and all terms already
present. We emphasize, however, that all individual correlation
coefficients between the new term and any term
already present are also always either within the regime
of "high" or "low".

We perform a full minimization over each new functional

$$ F(c; n, z) + c_i f_i, \tag{18} $$
A. Detailed Examples

FIG. 1: Plot of the change in least-squares error, $\Delta\chi$, resulting
from the addition of a new term as a function of the
term's average correlation coefficient $\langle |R| \rangle$. A large number
of highly correlated terms result in no change to the
least-squares error. The dotted black line indicates the cutoff
between high and low correlation. The subset of terms
with high correlation which contribute $\Delta\chi \approx 0.1$ MeV is
discussed in Sect. IV B.



While the significantly decreased computational cost
over full minimizations for each test functional is a profound
advantage of the correlation test, we provide further
quantitative results for the reduction in least-squares
error when the method is used by considering a simple
test case. From the principal axis method and
Refs. [8, 35], we expect linearly independent terms to
reduce the least-squares error more significantly than linearly
dependent terms. We therefore add new terms to
Eq. (15), chosen by testing their correlation to terms already
present, and compare the effect of the new terms
in lowering the least-squares error as a function of correlation.
From the 200 terms considered above, we have
selected four (two with low correlations, two with high
correlations) to demonstrate the method's utility. The
two terms chosen with high correlation with terms in
Eq. (15) are:



p 



(rip ~\~ Zp^\fip Zp\ 



(19) 
(20) 



where $f_i$ is the $i$th new term of the set. In Fig. 1 we
plot the change in least-squares error, $\Delta\chi$, in units of
MeV as a function of $\langle |R| \rangle$ for all terms reviewed for this
paper. The fit is performed over the set of 2,049 nuclei
from the 2003 atomic data evaluation [48]. Those with
a low average correlation coefficient, $\langle |R| \rangle < 0.5$, caused
the greatest change in $\chi$, in some cases by two orders of
magnitude over terms with high correlation, $\langle |R| \rangle > 0.5$.
Of those with low correlation, less than 15% contributed
a change in $\chi$ of less than 0.15 MeV. It is no surprise the
majority, 85%, of the terms had high correlation and reduced
the least-squares error by 50 MeV or less, whereas
39% of the terms reduced the least-squares error by 10
MeV or less. A significant number of terms, 20%, resulted
in $\Delta\chi = 0$ MeV, where the coefficient of the
added term was fit to $c_i = 0$. In comparing the computational
time required to minimize each term individually
to selecting terms via a correlation test we find a significant
advantage in the correlation test. Minimizing the
function for a single term required on the order of 170
CPU seconds, or about one day of wall-clock time on a single
3.4 GHz Intel Core i7 processor for the full set of about
200 terms. In contrast, a correlation test required only
1.62 CPU seconds to return the correlation analysis of
all the terms in question. Where it is crucial to minimize
the number of parameters in a model in order to avoid
obscuring physical insight [49], the correlation test shows
a clear advantage in providing a substantially less time-consuming
approach to eliminating large sets of terms
under consideration.



where both terms scale as the total number of nucleons
$A = N + Z$, with $N$ and $Z$ the total number of neutrons
and protons, respectively. The new terms chosen with
low correlation are:

p \ dp ) 

-|lK-V2f) (2D 
h = c/4^L (22) 

where $d_p$ is the dimensionality of the $p$th shell and $n_f, z_f$
are the occupations of the highest occupied neutron and
proton shells, respectively. Here, $\delta$ is 1, 0, and -1 for
even-even, odd-mass, and odd-odd nuclei, respectively.
Scaling of terms is at most $A$ [9].

We now briefly describe the terms in Eqs. (19)-(22) as
they relate to a mass formula. Equation (19) may be
physically motivated for inclusion as a three-body effect,
whereas Eq. (20) may also be physically motivated as
a density-dependent symmetry energy [50-54]. Likewise,
Eq. (21) may be included in a mass formula to account for
two-, three-, and four-body deformation effects as it has
a linear onset for almost-empty and almost-filled shells
and assumes its maximum at mid-shell with half-filled
occupations [9]. Equation (22) is the familiar bulk pairing
interaction of the Bethe-Weizsäcker mass formula [1].
As such, all terms may be reasonably considered for inclusion
in a mass formula, and indeed Eqs. (21) and (22)
are in the functional of Ref. [9]. However, the focus of
this paper is the effects of using correlation testing and
so we wish to demonstrate the usefulness of including a
term when physical arguments are not considered. This
limitation of the method is discussed in Sect. IV.

We determine the parameters $c$ through a least-squares
fit of the truncated formula in Eq. (15) to a set of 2,049
nuclei from the 2003 atomic data evaluation [48] whose
uncertainty in the binding energy is below 200 keV. The
resulting least-squares error is $\chi = 1.687$ MeV, with a
total of seven fit coefficients. We add an eighth fit coefficient
with the inclusion of one of the above-described
terms, Eqs. (19)-(22), the results of which are shown in
Table III. Based on the method of correlation testing we
would not expect the addition of $f_1$ or $f_2$ to result
in a significant decrease in the $\chi$ value. Indeed, adding
$f_1$ and refitting the parameters $c$ to the 2,049 nuclei
gives $\chi = 1.686$ MeV, a reduction of only $\Delta\chi = 0.001$
MeV in the least-squares error. Adding $f_2$ and refitting
the formula gives $\chi = 1.686$ MeV, the same small reduction
as with $f_1$. Thus, based purely on the correlation
test the terms $f_1$ and $f_2$ would not be included in the
development of a mass formula.

We must now, however, demonstrate that those terms
with low correlation are beneficial to the development of
a mass formula. Proceeding as before, we add $f_3$ and refit
the formula, obtaining a least-squares error of $\chi = 1.491$
MeV. This constitutes a reduction in $\chi$ over the formula
in Eq. (15) of $\Delta\chi = 0.196$ MeV. While the formula is
still far from competitive, the reduction is large by mass
formula standards [14]. The addition of $f_4$ gives a similarly
dramatic change. Refitting the formula with $f_4$
gives $\chi = 1.444$ MeV, a reduction in $\chi$ of $\Delta\chi = 0.243$
MeV. Both terms $f_3$ and $f_4$ cause reductions in $\chi$ two
orders of magnitude greater than those caused by the
highly correlated terms $f_1$ and $f_2$. Furthermore, the correlation
test in principle eliminated the two functions
which provided the least reduction, decreasing the set
of terms to undergo a full minimization by a method that
is less computationally expensive than a singular value or
eigen decomposition. In the case of this formula, the fitting
procedure is not costly. However, limiting the number
of candidate terms becomes increasingly important
for computationally expensive fits (for a discussion of
computational expense see Ref. [55]). The above example
demonstrates how the correlation test
provides insight into which candidate terms may reduce
the least-squares error of a mass formula by the greatest
amount without the computational expense of a full
minimization.
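The behaviour described above, where a weakly correlated term lowers the least-squares error far more than a strongly correlated one and no added term can raise it, can be reproduced on a toy linear model. The sketch below is not the nuclear fit: the data, the base term, and both candidate terms are hypothetical, and the fit solves the normal equations directly, which is adequate only for a handful of well-conditioned terms.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def rms_after_fit(terms, data):
    """Fit data to sum_a c_a * terms[a] by least squares; return the rms residual."""
    gram = [[sum(p * q for p, q in zip(f, g)) for g in terms] for f in terms]
    rhs = [sum(p * d for p, d in zip(f, data)) for f in terms]
    coeffs = solve_linear_system(gram, rhs)
    residuals = [d - sum(c * f[i] for c, f in zip(coeffs, terms))
                 for i, d in enumerate(data)]
    return math.sqrt(sum(r * r for r in residuals) / len(data))


def correlation(f, g):
    """Correlation coefficient between two terms on the same sample points."""
    n = len(f)
    mf, mg = sum(f) / n, sum(g) / n
    cov = sum(a * b for a, b in zip(f, g)) / n - mf * mg
    s_f = math.sqrt(sum(a * a for a in f) / n - mf * mf)
    s_g = math.sqrt(sum(b * b for b in g) / n - mg * mg)
    return cov / (s_f * s_g)


# Toy data (assumption): a smooth trend plus an alternating, pairing-like part.
x = [float(i) for i in range(1, 21)]
sign = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 21)]
data = [3.0 * xi + 2.0 * si for xi, si in zip(x, sign)]

base = [x]                        # model already contains the term x
high = [xi * xi for xi in x]      # candidate strongly correlated with x
low = sign                        # candidate weakly correlated with x

rms_base = rms_after_fit(base, data)
rms_high = rms_after_fit(base + [high], data)
rms_low = rms_after_fit(base + [low], data)

print(f"|R| high = {abs(correlation(x, high)):.3f}, rms {rms_base:.4f} -> {rms_high:.4f}")
print(f"|R| low  = {abs(correlation(x, low)):.3f}, rms {rms_base:.4f} -> {rms_low:.4f}")
```

In this toy case the correlation test would have rejected the quadratic candidate before any fit was run, saving one of the two minimizations.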



TABLE III: Least-squares deviations of binding energies resulting
from a global fit of the formula to 2,049 nuclei [48].
The left-hand column identifies the form of the formula in
terms of which of the four terms $f_1, f_2, f_3, f_4$ is added to
Eq. (15). The right-most column gives the average correlation
$\langle |R| \rangle$ of the added term relative to terms in Eq. (15).
Correlations may be thought of as either low (< 0.5) or high
(> 0.5).

Form                 χ (MeV)    ⟨|R|⟩
F(c; n, z)           1.687
F(c; n, z) + f_1     1.686      ≈ 0.998
F(c; n, z) + f_2     1.686      ≈ 0.815
F(c; n, z) + f_3     1.491      ≈ 0.405
F(c; n, z) + f_4     1.444      ≈ 0.004

IV. LIMITATIONS

On equal footing with the benefits of the correlation
test we will also discuss the limitations of the method,
which have been indicated in the previous section. As
in Sect. III we will highlight the possible limitations of
correlation testing quantitatively.



A. Degree of Independence

Bertsch et al. [35] order the importance of principal
axes of the functional by the eigenvalues of their eigen
decomposition. In the case of correlation testing one
may be tempted to interpret the absolute value of the
correlation coefficient $|R|$ as an indication of the level of
dependence between two terms. However, the authors of
Refs. [8, 43] warn against such a general interpretation.
There is a need for context when giving meaning to the
value of $|R|$, and definitions of "high" and "low" correlation
are somewhat arbitrary. References [58] and [59]
demonstrate the covariance as isomorphic to the cosine
of a "correlation angle" between two statistical quantities
when $\mathrm{var}(x_1), \mathrm{var}(x_2) \neq 0$ and $x_1, x_2 \in \mathbb{R}$. However, the
authors again caution that such an interpretation provides
no information on the linear dependence we seek to
find in functional development.
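The geometric picture of a "correlation angle" can be checked numerically. The sketch below relies on the standard identity that the correlation coefficient of two samples equals the cosine of the angle between their mean-centered vectors; the two samples and the helper names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a sample."""
    return sum(values) / len(values)


def correlation(f, g):
    """R = (<fg> - <f><g>) / (s_f s_g)."""
    cov = mean([a * b for a, b in zip(f, g)]) - mean(f) * mean(g)
    s_f = math.sqrt(mean([a * a for a in f]) - mean(f) ** 2)
    s_g = math.sqrt(mean([b * b for b in g]) - mean(g) ** 2)
    return cov / (s_f * s_g)


def cosine_of_centered(f, g):
    """Cosine of the angle between the mean-centered vectors of f and g."""
    mf, mg = mean(f), mean(g)
    fc = [a - mf for a in f]
    gc = [b - mg for b in g]
    dot = sum(a * b for a, b in zip(fc, gc))
    norm_f = math.sqrt(sum(a * a for a in fc))
    norm_g = math.sqrt(sum(b * b for b in gc))
    return dot / (norm_f * norm_g)


# Hypothetical samples with nonzero variance.
f = [1.0, 2.0, 4.0, 7.0, 11.0]
g = [2.0, 1.0, 5.0, 3.0, 8.0]

r = correlation(f, g)
cos_angle = cosine_of_centered(f, g)
angle_deg = math.degrees(math.acos(cos_angle))
print(f"R = {r:.6f}, cos(angle) = {cos_angle:.6f}, angle = {angle_deg:.2f} deg")
```

The angle is a convenient picture, but, as the cited authors caution, its size alone does not grade how useful a term will be in a fit.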

Figure 1 similarly indicates this limitation of the correlation
test for the full set of about 200 terms. One may
define a vertical division (black dotted line of Fig. 1) between
"high" and "low" correlation within the context of
the problem, but the trend below $\langle |R| \rangle = 0.5$ is primarily
a flat line, not indicating a reliable statement on the
degree of independence. So while the method of Ref. [35]
may be more computationally expensive, the correlation
test has significantly reduced the number of terms to the point where
it may be feasibly used.



B. Limited Considerations 

There are a few other technical issues of which the
would-be user must be wary. The first of these we mention
is the inherent linearity of the correlation coefficient.
As one can see from Eq. (7), the correlation coefficient
provides a constant ratio between its arguments $c_1, c_2$ and
thus is only a measure of the linear relationship between
two variables [43]. As a result, the correlation coefficient
contains no knowledge of the non-linear behavior
that exists between two variables. In fact, this linearity
contributes to a far more fundamental (and dangerous)
problem with the correlation test: it lacks any knowledge
of the physics behind the functional or formula being
built. This, perhaps most obvious, issue is manifest
in two ways. We start by quantitatively demonstrating
the first: physically important terms may be omitted due
to high correlations.

To demonstrate that some terms which may contain
important physical properties might be rejected by
the correlation test we employ once again the Bethe-Weizsäcker
mass formula.

TABLE IV: Average correlation $\langle |R| \rangle$ of the terms in the
Bethe-Weizsäcker mass formula [1], Eq. (15), and the $\chi$ value
resulting from a fit of the mass formula to the 2,049 nuclei [48] with
the term removed.

Term                ⟨|R|⟩     χ (MeV) without term
A                   0.7       38.04
A^{2/3}             0.7       14.01
Z(Z-1)/A^{1/3}      0.7       25.2
(N-Z)^2/A           0.6       23.19
δ                   0.008     3.0

All of the terms in the Bethe-Weizsäcker mass formula are correlated
with each other with an average correlation coefficient of at least
$\langle |R| \rangle \approx 0.6$, except for the pairing term, for
which $\langle |R| \rangle \approx 8 \times 10^{-3}$. Despite the
volume, surface, Coulomb and isospin terms having "high" correlation
by the definition used for Eq. (15) and Ref. [?], they are of clear
physical importance in describing the bulk behavior of nuclei.
Furthermore, their removal increases the least-squares error of the
Bethe-Weizsäcker mass formula from $\chi = 2.9$ MeV to between
$\chi = 14$ and $38$ MeV when fitted to the set of 2,049 nuclei [43].
Details of the effect of each term on $\chi$ are given in Table IV. The
lack of impact on $\chi$ by the pairing term is not surprising, as it
scales as $A^{-1/2}$, much lower than the other terms, and so scaling
must be considered by the user [?]. That highly correlated terms
contribute to a better description of nuclear binding should come as
no surprise. The linearity of the correlation coefficient contains no
knowledge of higher-order effects in the nuclear interaction, which
are necessary for an accurate description of nuclei [44-63]. This is
seen as well in Fig. [?], where a single outlier appears at
$\langle |R| \rangle \approx 0.25$ that does not greatly contribute to
reducing $\chi$. Evaluation of this term indicates a higher-order
effect, not identifiable by the correlation test. So while a term may
be highly correlated at first order, its relationship at higher order
is unknown, and the term is thus not necessarily redundant. Such an
effect is also seen in Fig. [?], where a certain subset of terms with
high correlation ($\langle |R| \rangle \approx 0.9$) provide a
reduction in $\chi$ on the order of $\Delta\chi \sim 0.1$ MeV.
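To make the averaged correlation coefficient concrete, the following is a minimal Python sketch of how $\langle |R| \rangle$ could be evaluated for the five Bethe-Weizsäcker terms. It is an illustration under stated assumptions, not the procedure used for Table IV: the list `nuclei` is a hypothetical grid of $(Z, N)$ values rather than the evaluated set of 2,049 nuclei, the pairing term is assumed to take the common form $\pm A^{-1/2}$ (zero for odd $A$), and the helper names are invented here. The numbers it prints will therefore differ from those in the table.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    var_u = sum((p - mu) ** 2 for p in u)
    var_v = sum((q - mv) ** 2 for q in v)
    return cov / math.sqrt(var_u * var_v)

def pairing(Z, N):
    """Assumed pairing form: +A^(-1/2) even-even, -A^(-1/2) odd-odd, else 0."""
    A = Z + N
    if Z % 2 == 0 and N % 2 == 0:
        return A ** -0.5
    if Z % 2 == 1 and N % 2 == 1:
        return -(A ** -0.5)
    return 0.0

# Hypothetical grid of nuclei (NOT the evaluated set of 2,049 nuclei).
nuclei = [(Z, N) for Z in range(8, 101) for N in range(Z, int(1.5 * Z) + 1)]

# Each term of the mass formula evaluated on every nucleus of the grid.
terms = {
    "volume": [float(Z + N) for Z, N in nuclei],
    "surface": [(Z + N) ** (2 / 3) for Z, N in nuclei],
    "coulomb": [Z * (Z - 1) / (Z + N) ** (1 / 3) for Z, N in nuclei],
    "isospin": [(N - Z) ** 2 / (Z + N) for Z, N in nuclei],
    "pairing": [pairing(Z, N) for Z, N in nuclei],
}

def average_abs_correlation(terms):
    """Mean |R| of each term against every other term."""
    result = {}
    for name, values in terms.items():
        others = [abs(pearson(values, w)) for k, w in terms.items() if k != name]
        result[name] = sum(others) / len(others)
    return result

avg_abs_R = average_abs_correlation(terms)
for name, value in avg_abs_R.items():
    print(f"{name:8s} <|R|> = {value:.3f}")
```

On such a grid the pairing term, whose sign alternates from nucleus to nucleus, comes out nearly uncorrelated with the smooth bulk terms, which is the qualitative pattern discussed above.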

The converse of missing physical terms is true as well: the
correlation test may retain terms with no physical meaning because of
a low correlation coefficient. One such term which may be added to the
formula of Eq. (15) is



z f n f 

which is a symmetric ratio of proton and neutron occupations in the
highest occupied orbital. The term in Eq. (23) has a low average
correlation coefficient, $\langle |R| \rangle \approx 0.1$, and so
represents a linearly independent direction in the function space.
However, this term has no physical meaning in the context of nuclear
binding. Thus, despite any potential ability to reduce $\chi$, this
term should not be considered for use in a formula for nuclear masses.
So while the linearity of the correlation test does not eliminate its
value as an aid in function development, the user must apply context
and insight in its use.

V. CONCLUSIONS 

Nuclear DFT has proven to be a useful tool for globally describing the
ground-state properties of nuclei, yet the construction of the
functional itself poses a significant challenge. The correlation test
introduced in Ref. [?] provides a computationally inexpensive method
of discriminating amongst potential terms to include in a nuclear mass
formula or functional. In this paper we have provided the necessary
mathematical foundation of the method (Sect. III B) that has hitherto
been missing, and provided a quantitative characterization of its
performance.

In Sect. III we have developed the benefits of the correlation test by
benchmarking its considerably low computational expense. In Sect. III A
we further demonstrated the performance of the correlation test in
selecting appropriate terms for the development of a nuclear mass
functional. The inclusion of terms with low correlations led to a
reduction in the least-squares error on the order of
$\Delta\chi \sim 0.1$ MeV, compared to reductions on the order of
$\Delta\chi \sim 0.001$ MeV for highly correlated terms.

In order to provide a complete characterization of correlation
testing, in Sect. IV we turned to the limitations of the method. A
quantitative description is presented in Sect. IV A, demonstrating the
method's issues in providing a measure of the degree of independence
between any two terms in a function. In Sect. IV B we further
demonstrated the need for physical insight in the use of the
correlation test. The correlation test has no knowledge of the
necessary physics, and so the cautious user is not free of the
detailed considerations necessary in the field of mass modeling.

With this more rigorous foundation, the method of correlation testing
may be elevated from a "quick-and-dirty" method to simply a "quick"
method for the systematic development of functionals and formulae in
the study of nuclear masses. While the method is computationally cheap
and provides a reliable indication of principal axes in a functional
space, it is not without limitations. Improvements to the method may
come from a more rigorous treatment of the statistical foundation and
an investigation of the effects of adding more than a single term at a
time. Rank correlation [64-66] may be implemented in order to remove
the inherent linearity of the test, and is another avenue by which the
method can be enhanced.
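As an illustration of why rank correlation could relax the linearity of the test, the sketch below compares the Pearson coefficient with a Spearman-type rank coefficient on a monotonic but nonlinear relationship. It is a minimal example under stated assumptions, not an implementation taken from Refs. [64-66]: the data `x`, `y` are invented, the helper names are hypothetical, and ties in the ranking are not handled.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    var_u = sum((p - mu) ** 2 for p in u)
    var_v = sum((q - mv) ** 2 for q in v)
    return cov / math.sqrt(var_u * var_v)

def ranks(values):
    """Rank of each entry (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result

def spearman(u, v):
    """Spearman-type rank correlation: Pearson coefficient of the ranks."""
    return pearson(ranks(u), ranks(v))

# Invented example: y is a monotonic but nonlinear function of x.
x = [float(i) for i in range(1, 21)]
y = [xi ** 3 for xi in x]

r_linear = pearson(x, y)   # below 1, since the relationship is not linear
r_rank = spearman(x, y)    # equal to 1 up to rounding, since it is monotonic
print(f"Pearson R = {r_linear:.3f}, rank R = {r_rank:.3f}")
```

A rank-based coefficient would flag these two quantities as fully dependent, whereas the linear coefficient alone understates the relationship.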
Appendix: Demonstration by Contradiction

The mathematical foundation for the correlation test as an indication
of linear independence is set forth here. We consider two variables
$x_1, x_2$ and two functions $y_1(x_1, x_2)$, $y_2(x_1, x_2)$ which
depend on $x_1, x_2$. We will demonstrate by contradiction that $y_1$
and $y_2$ are linearly independent in $x_1$ and $x_2$ if their
covariance is zero. This differs from statistical independence, as
$y_1$ and $y_2$ are clearly dependent on the same random variables
$x_1$ and $x_2$.

We begin by setting out to show that if

$$ \mathrm{cov}(y_1, y_2) = 0 \tag{A.1} $$

then $y_1$ and $y_2$ are not linearly dependent. To demonstrate by
contradiction we start by assuming $y_1$ and $y_2$ are linearly
dependent and attempt to show that their covariance can still be zero
only with a mathematical contradiction.

Let $y_1, y_2$ be two linear functions of $x_1, x_2$ that are
linearly dependent:

$$ y_1 = a x_1 + b x_2 \tag{A.2} $$
$$ y_2 = c x_1 + d x_2 \tag{A.3} $$



The coefficients $a, b, c, d$ must satisfy the condition for
linear dependence:

$$ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = 0 \tag{A.4} $$



where $|\cdot|$ indicates the determinant. As we are interested in the
application of this method to the development of nuclear functionals
and formulae, and in those cases for which the correlation coefficient
is defined, we set the following assumptions:

1. $\mathrm{var}(x_1), \mathrm{var}(x_2) \neq 0$

2. $y_1, y_2 \in \mathbb{R}$

3. $x_1, x_2 \in \mathbb{R}$

4. $a, b, c, d \in \mathbb{R}$

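We now calculate $\mathrm{cov}(y_1, y_2)$:

$$ \mathrm{cov}(y_1, y_2) = \langle y_1 y_2 \rangle - \langle y_1 \rangle \langle y_2 \rangle \tag{A.5} $$

where

$$ \langle y_1 \rangle = a \langle x_1 \rangle + b \langle x_2 \rangle \tag{A.6} $$
$$ \langle y_2 \rangle = c \langle x_1 \rangle + d \langle x_2 \rangle \tag{A.7} $$
$$ \langle y_1 y_2 \rangle = ac \langle x_1^2 \rangle + bd \langle x_2^2 \rangle + (ad + bc) \langle x_1 x_2 \rangle \tag{A.8} $$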

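Putting Eqs. (A.6)-(A.8) into Eq. (A.5) we obtain:

$$ \mathrm{cov}(y_1, y_2) = ac \langle x_1^2 \rangle + bd \langle x_2^2 \rangle + (ad + bc) \langle x_1 x_2 \rangle
   - ac \langle x_1 \rangle^2 - bd \langle x_2 \rangle^2 - (ad + bc) \langle x_1 \rangle \langle x_2 \rangle \tag{A.9} $$

Some simplification gives:

$$ \mathrm{cov}(y_1, y_2) = ac \left( \langle x_1^2 \rangle - \langle x_1 \rangle^2 \right) + bd \left( \langle x_2^2 \rangle - \langle x_2 \rangle^2 \right)
   + (ad + bc) \left( \langle x_1 x_2 \rangle - \langle x_1 \rangle \langle x_2 \rangle \right) $$
$$ = ac\, \mathrm{var}(x_1) + bd\, \mathrm{var}(x_2) + (ad + bc)\, \mathrm{cov}(x_1, x_2) = 0 \tag{A.10} $$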
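The expansion of the covariance into $ac\,\mathrm{var}(x_1) + bd\,\mathrm{var}(x_2) + (ad + bc)\,\mathrm{cov}(x_1, x_2)$ can be checked numerically. The sketch below is only a sanity check under assumed inputs: the samples `x1`, `x2` and the coefficients `a`, `b`, `c`, `d` are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import random

def mean(values):
    """Sample mean <v>."""
    return sum(values) / len(values)

def cov(u, v):
    """Covariance <uv> - <u><v>, written in centered form."""
    mu, mv = mean(u), mean(v)
    return mean([(p - mu) * (q - mv) for p, q in zip(u, v)])

random.seed(0)  # fixed seed so the check is reproducible
n = 5000
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [0.5 * p + random.gauss(0.0, 1.0) for p in x1]  # correlated with x1

# Arbitrary real coefficients, chosen only for illustration.
a, b, c, d = 1.5, -0.7, 0.4, 2.0
y1 = [a * p + b * q for p, q in zip(x1, x2)]
y2 = [c * p + d * q for p, q in zip(x1, x2)]

# Left-hand side: covariance computed directly from y1 and y2.
lhs = cov(y1, y2)
# Right-hand side: the expanded form in terms of var(x1), var(x2), cov(x1, x2).
rhs = a * c * cov(x1, x1) + b * d * cov(x2, x2) + (a * d + b * c) * cov(x1, x2)

print(f"direct: {lhs:.6f}  expanded: {rhs:.6f}")
```

The two values agree to rounding error for any choice of coefficients, since the identity is purely algebraic.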



Utilizing Eq. (A.4), we have the following relationship:

$$ ad - bc = 0. \tag{A.11} $$

Using this relation (that is, $a = bc/d$) in Eq. (A.10) we may rewrite
it as

$$ \frac{b}{d} c^2\, \mathrm{var}(x_1) + bd\, \mathrm{var}(x_2) + 2bc\, \mathrm{cov}(x_1, x_2) = 0. \tag{A.12} $$



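We treat Eq. (A.12) as a quadratic equation in $c$, and attempt to
solve for the coefficient:

$$ c = \frac{-b\, \mathrm{cov}(x_1, x_2)}{\frac{b}{d} \mathrm{var}(x_1)}
   \pm \frac{\sqrt{b^2 [\mathrm{cov}(x_1, x_2)]^2 - b^2\, \mathrm{var}(x_1) \mathrm{var}(x_2)}}{\frac{b}{d} \mathrm{var}(x_1)} \tag{A.13} $$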
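For readers who want the intermediate step, the solution for $c$ follows from the standard quadratic formula. In the sketch below the labels $P$, $Q$, $S$ are introduced purely for illustration and do not appear in the original derivation; $b \neq 0$ and $d \neq 0$ are assumed so that the leading coefficient is defined and non-zero.

```latex
\begin{align*}
  P c^2 + Q c + S &= 0, \qquad
  P = \frac{b}{d}\,\mathrm{var}(x_1), \quad
  Q = 2b\,\mathrm{cov}(x_1, x_2), \quad
  S = bd\,\mathrm{var}(x_2), \\
  Q^2 - 4PS &= 4b^2 [\mathrm{cov}(x_1, x_2)]^2 - 4b^2\,\mathrm{var}(x_1)\,\mathrm{var}(x_2), \\
  c = \frac{-Q \pm \sqrt{Q^2 - 4PS}}{2P}
    &= \frac{-b\,\mathrm{cov}(x_1, x_2) \pm
       \sqrt{b^2 [\mathrm{cov}(x_1, x_2)]^2 - b^2\,\mathrm{var}(x_1)\,\mathrm{var}(x_2)}}
       {\frac{b}{d}\,\mathrm{var}(x_1)} .
\end{align*}
```

The common factor of 2 cancels between numerator and denominator, so the sign of the quantity under the square root is set by $[\mathrm{cov}(x_1, x_2)]^2 - \mathrm{var}(x_1)\,\mathrm{var}(x_2)$ alone.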
In order to satisfy the initial assumption $c \in \mathbb{R}$ we must
verify that the discriminant of Eq. (A.13) is non-negative,

$$ b^2 [\mathrm{cov}(x_1, x_2)]^2 - b^2\, \mathrm{var}(x_1) \mathrm{var}(x_2) \geq 0. \tag{A.14} $$

Eq. (A.14) simplifies to

$$ [\mathrm{cov}(x_1, x_2)]^2 \geq \mathrm{var}(x_1) \mathrm{var}(x_2). \tag{A.15} $$

However, this is a contradiction, as the following relationship is
known via the Cauchy-Schwarz inequality [67]:

$$ [\mathrm{cov}(x_1, x_2)]^2 \leq \mathrm{var}(x_1) \mathrm{var}(x_2) \tag{A.16} $$

where the equality only holds true when $x_2$ is a multiple of $x_1$,
$x_2 = \beta x_1$. In the case of an equality in Eq. (A.16), we can
rewrite $y_1, y_2$:

$$ y_1 = a x_1 + b \beta x_1 = f x_1 \tag{A.17} $$
$$ y_2 = c x_1 + d \beta x_1 = g x_1 \tag{A.18} $$

where $f = a + b\beta$ and $g = c + d\beta$. Making use of the
properties $\mathrm{cov}(ax, y) = a\, \mathrm{cov}(x, y)$ and
$\mathrm{cov}(x, x) = \mathrm{var}(x)$, the covariance is

$$ \mathrm{cov}(y_1, y_2) = \mathrm{cov}(f x_1, g x_1) \tag{A.19} $$
$$ = fg\, \mathrm{var}(x_1). \tag{A.20} $$

Since we have assumed $\mathrm{var}(x_1), \mathrm{var}(x_2) \neq 0$,
Eq. (A.20) is zero only for $fg = 0$. Thus,
$\mathrm{cov}(y_1, y_2) = 0$ only when $y_1 = 0$ or $y_2 = 0$. Where
either $y_1 = 0$ or $y_2 = 0$ the events $y_1, y_2$ are not linearly
dependent and we arrive at another contradiction.

Therefore,

$$ \mathrm{cov}(y_1, y_2) = 0 \tag{A.21} $$

only in the case of $y_1 = 0$ or $y_2 = 0$, or $c \notin \mathbb{R}$.
We find a contradiction in our initial assumptions, and so if
$\mathrm{cov}(y_1, y_2) = 0$ then $y_1, y_2$ are not linearly
dependent when $a, b, c, d \in \mathbb{R}$, despite a statistical
dependence. The extension of this demonstration to more variables is
straightforward and so we omit it here.
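The degenerate case $x_2 = \beta x_1$ can also be checked numerically. The sketch below verifies that, for linearly dependent $y_1$ and $y_2$ built on such variables, the covariance equals $fg\,\mathrm{var}(x_1)$ with $f = a + b\beta$ and $g = c + d\beta$, and is therefore non-zero unless $f$ or $g$ vanishes. The sample, the value of `beta`, and the coefficients are arbitrary illustrative choices, not values from the text.

```python
import random

def mean(values):
    """Sample mean <v>."""
    return sum(values) / len(values)

def cov(u, v):
    """Covariance <uv> - <u><v>, written in centered form."""
    mu, mv = mean(u), mean(v)
    return mean([(p - mu) * (q - mv) for p, q in zip(u, v)])

random.seed(1)  # fixed seed so the check is reproducible
beta = 0.8
x1 = [random.gauss(0.0, 1.0) for _ in range(2000)]
x2 = [beta * p for p in x1]  # equality case of Cauchy-Schwarz: x2 = beta * x1

# Illustrative coefficients satisfying the dependence condition ad - bc = 0.
a, b, c, d = 1.0, 2.0, 3.0, 6.0
assert a * d - b * c == 0.0

y1 = [a * p + b * q for p, q in zip(x1, x2)]
y2 = [c * p + d * q for p, q in zip(x1, x2)]

f = a + b * beta
g = c + d * beta
cov_y = cov(y1, y2)               # covariance computed directly
expected = f * g * cov(x1, x1)    # f * g * var(x1)

print(f"cov(y1, y2) = {cov_y:.6f}, f*g*var(x1) = {expected:.6f}")
```

With these coefficients both $f$ and $g$ are non-zero, so the covariance of the two dependent functions is non-zero as well.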